Comparing the current short-term cancer incidence prediction models in Brazil with state-of-the-art time-series models

The World Health Organization has highlighted that cancer was the second-highest cause of death in 2019. This research aims to present the current forecasting techniques found in the literature, as applied to predict time-series cancer incidence, and then to compare these results with the current methodology adopted by the Instituto Nacional do Câncer (INCA) in Brazil. A set of univariate time-series approaches is proposed to aid decision-makers in monitoring and organizing cancer prevention and control actions. Additionally, this can guide oncological research towards more accurate estimates that align with the expected demand. Forecasting techniques were applied to real data from seven types of cancer in a Brazilian district. Each method was evaluated by comparing its fit with real data using the root mean square error, and we also assessed the quality of noise to identify biased models. Notably, three methods proposed in this research have never been applied to cancer prediction before. The data were collected from the INCA website, and the forecast methods were implemented using the R language. A literature review made it possible to draw comparisons with previous works worldwide, illustrating that cancer prediction is often focused on breast and lung cancers, typically utilizing a limited number of time-series models to find the best fit for each case. Additionally, in comparison to the current method applied in Brazil, it has been shown that employing more generalized forecast techniques can provide more reliable predictions. Comparing error results between the mentioned approaches and the current technique, it has been shown that the current method applied by INCA underperforms in six out of seven types of cancer tested. Moreover, by evaluating the noise in the current method, this research identified that it can produce a biased prediction for two of the seven cancers evaluated. Therefore, it is suggested that the methods evaluated in this work should be integrated into the INCA cancer forecast methodology to provide reliable predictions for Brazilian healthcare professionals, decision-makers, and oncological researchers.

1. Breast and lung cancer incidence predictions have garnered more attention in the specialized literature and have been studied in 8 and 7 works, respectively; colorectal cancer has been studied in 5 works, while other cancer types have been studied in 4 works or fewer.
2. CSM, and particularly ARIMA, were the most used approaches.
3. Considering SSM and MLM, TBATS, NNETAR and MLP were never covered in previous research.
4. We found no previous work in which all three classes of models were applied.
As will be presented in this paper, the third and fourth conclusions allow us to state that this work covers a gap in current cancer prediction. Thus, applying unseen methods (3rd) and the three classes of models (4th) to cancer prediction is an original contribution of this research.
Finally, the mentioned studies address the application of different forecasting methods in countries such as Canada, Switzerland, Fiji, China, Malaysia, and Romania. Their use in Brazil, for a larger sample of cancer types and with a comparison among them, is a complementary contribution.

Forecasting models applied
In this research we apply the univariate forecasting methods available in Hyndman and Khandakar 26, Petris 27 and Kourentzes 28. The models applied in the next sections are presented in Table 5. These models were implemented in the R 29 language (version 4.1.3), and the code used is available in the Supplementary Material (Forecasting code.R).
Building each model requires estimating many parameters; the main features of each model are presented below:
• ETS: ETS is a class of models that works with three component equations, level ($l_t$), trend ($b_t$) and season ($s_t$), to explain the original time-series variable ($y_t$) that we aim to forecast. In each model, a component may be non-significant, denoted None (N), or significant, in which case it helps to describe $y_t$ 30.
The difference between ARIMA and SARIMA models lies in the same components appearing lagged by the length of the seasonal time window (frequency), denoted P, D and Q. For more details see Hyndman and Athanasopoulos 9 and Kotu and Deshpande 30.
• Kalman filter (KF): KF methods search for the smallest vector that summarizes the past of the system and best describes the state of a deterministic dynamic system 31. The KF is composed of a linear autoregressive state equation $x(t) = A\,x(t-1) + W(t)$, where $W(t) \sim N(0, Q)$, and a measurement equation $y(t) = C\,x(t) + V(t)$, where $V(t) \sim N(0, R)$, which together define the linearized process in which $y(t) \in \mathbb{R}$. The random variables $W(t)$ and $V(t)$ are assumed to be independent of each other, and both must follow a normal distribution.
• TBATS: The TBATS model combines a Trigonometric Seasonal (T) Exponential Smoothing Method, a Box-Cox Transformation and an ARMA model for the residuals (BATS). In the TBATS equations, $\omega$ and $\phi$ are the Box-Cox and damping parameters, respectively, an ARMA(p, q) process models the error, $m_1$ to $m_J$ list the seasonal periods used, and $k_1$ to $k_J$ are the corresponding numbers of Fourier terms. For more details see De Livera et al. 32.
• NNETAR: Neural Network Time Series Forecasts (NNETAR) is a class of feed-forward neural networks with a single hidden layer and lagged inputs. This model works with 2 (for non-seasonal time series) or 3 (for seasonal time series) parameters: the number of past observations used as inputs (p), the number of past observations lagged by the length of the seasonal time window used as inputs (P), and the number of neurons (k) in the single hidden layer. In this research, a total of 20 networks are fitted, each with random starting weights. These are then averaged when computing forecasts. The network is trained for one-step forecasting, and multi-step forecasts are computed recursively. The k selected for each type of cancer is half of the number of input nodes plus 1. For non-seasonal data, the fitted model is denoted as an NNAR(p, k) (Neural Network Autoregressive) model, which is analogous to an AR(p) model but with nonlinear functions. For seasonal data, the fitted model is called an NNAR(p, P, k)[m] model, which is analogous to an ARIMA(p, 0, 0)(P, 0, 0)[m] model but with nonlinear functions. For more details see Hyndman and Athanasopoulos 9.
• MLP: MLP is an extension of the feed-forward neural network in which an arbitrary number of hidden layers (the true computational engine of the MLP) are placed between the input and output layers. According to Kourentzes et al. 33, MLPs are designed to approximate any continuous function and can solve problems that are not linearly separable. In our time-series problem, the inputs (like p in the NNETAR model) are the most recent past observations, and we set the MLP model to choose the number of input lags, between 1 and the prediction length (3 years), according to the Mean Square Error. The same criterion was adopted to choose the number of hidden nodes in each hidden layer. For more details see Kourentzes et al. 33.
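For reference, the TBATS formulation is usually written as shown below. This is a sketch in the notation of De Livera et al. 32 and should not be read as a reproduction of the equations typeset in this paper. The text names only $\omega$, $\phi$, p, q, $m_i$ and $k_i$; the smoothing parameters $\alpha$, $\beta$, $\gamma_1^{(i)}$, $\gamma_2^{(i)}$, the long-run trend $b$, the ARMA coefficients $\varphi_i$ and $\theta_i$, and the innovation $\varepsilon_t$ are symbols assumed here.

```latex
\begin{align}
y_t^{(\omega)} &=
  \begin{cases}
    \dfrac{y_t^{\omega} - 1}{\omega}, & \omega \neq 0 \\[6pt]
    \log y_t, & \omega = 0
  \end{cases} \\
y_t^{(\omega)} &= l_{t-1} + \phi\, b_{t-1} + \sum_{i=1}^{J} s_{t-1}^{(i)} + d_t \\
l_t &= l_{t-1} + \phi\, b_{t-1} + \alpha\, d_t \\
b_t &= (1 - \phi)\, b + \phi\, b_{t-1} + \beta\, d_t \\
d_t &= \sum_{i=1}^{p} \varphi_i\, d_{t-i} + \sum_{i=1}^{q} \theta_i\, \varepsilon_{t-i} + \varepsilon_t \\
s_t^{(i)} &= \sum_{j=1}^{k_i} s_{j,t}^{(i)} \\
s_{j,t}^{(i)} &= s_{j,t-1}^{(i)} \cos \lambda_j^{(i)} + s_{j,t-1}^{*(i)} \sin \lambda_j^{(i)} + \gamma_1^{(i)} d_t \\
s_{j,t}^{*(i)} &= -\, s_{j,t-1}^{(i)} \sin \lambda_j^{(i)} + s_{j,t-1}^{*(i)} \cos \lambda_j^{(i)} + \gamma_2^{(i)} d_t \\
\lambda_j^{(i)} &= \frac{2 \pi j}{m_i}
\end{align}
```

Here $l_t$ and $b_t$ are the level and trend components, $s_t^{(i)}$ is the $i$-th seasonal component built from $k_i$ Fourier pairs with period $m_i$, and $d_t$ is the ARMA(p, q) error process.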

Forecasting models evaluation
The data presented in Table 2 were multiplied by the I/M ratio of each cancer type, shown in Table 4, to estimate the incidence rate of each type of cancer evaluated (Fig. 3).
For instance, for breast cancer, the ICD-10 mortality rates per 100,000 inhabitants are 17.77 in 1979, 21.73 in 1980 and so on (second column of Table 2). Thus, the adjusted breast cancer incidence rate is given by these values multiplied by 5.59 (the breast cancer $I_R/M_0$ ratio in Table 4), which yields 77.65 in 1979, 94.96 in 1980, 69 in 1981 and so on, as can be seen in Fig. 3.
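The scaling step described above can be sketched as follows. This is a minimal Python illustration (the authors' implementation is in R); the function name `adjusted_incidence` and the numeric values are hypothetical placeholders and are not taken from Table 2 or Table 4.

```python
def adjusted_incidence(mortality_rates, im_ratio):
    """Scale each yearly mortality rate by a fixed incidence-to-mortality ratio."""
    return {year: round(rate * im_ratio, 2) for year, rate in mortality_rates.items()}


# Hypothetical placeholder values (NOT from Table 2 or Table 4):
# mortality rates per 100,000 inhabitants, keyed by year.
mortality = {2000: 10.0, 2001: 12.5}

# Hypothetical I/M ratio of 2.0 for a single cancer type.
incidence = adjusted_incidence(mortality, im_ratio=2.0)
print(incidence)
```

Because a single ratio per cancer type is applied to every year, the estimated incidence series has the same shape as the mortality series and differs only in scale.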
In this research, we are interested in providing a comparison between Brazil's current short-term cancer prediction and state-of-the-art time-series models. As mentioned in Section Theoretical Background, since the current short-term cancer predictions are made 3 years ahead, we split our dataset into training data (from 1979 to 2017) and test data (from 2018 to 2020).
Training (in-sample) and test (out-of-sample) data are evaluated using the Root Mean Square Error (RMSE) criterion. A low in-sample RMSE indicates a good average fit of the model used, while a low out-of-sample RMSE indicates that the model used, on average, delivers a reliable forecast 9.
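The split and the error criterion can be sketched as below. This is an illustrative Python sketch, not the authors' R code: the series is synthetic, the helper names `split_by_year` and `rmse` are hypothetical, and the naive "repeat the last training value" forecast merely stands in for any of the fitted models.

```python
import math


def split_by_year(series, last_train_year):
    """Split a {year: value} series into training and test parts."""
    train = {y: v for y, v in series.items() if y <= last_train_year}
    test = {y: v for y, v in series.items() if y > last_train_year}
    return train, test


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic yearly series covering 1979-2020 (placeholder data).
series = {year: 50.0 + 0.5 * (year - 1979) for year in range(1979, 2021)}

# Training data: 1979-2017; test data: 2018-2020 (3 years ahead).
train, test = split_by_year(series, last_train_year=2017)

# Stand-in forecast: repeat the last training value for each test year.
forecast = [train[2017]] * len(test)
out_of_sample_rmse = rmse(list(test.values()), forecast)
print(len(train), len(test), out_of_sample_rmse)
```

The same `rmse` function applied to the fitted values over the training years gives the in-sample figure.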
The predictions of the current and proposed methods for each cancer type were evaluated on two fronts: the noise of the residuals over the training (in-sample) data and the error over the test (out-of-sample) data. If the residuals have zero mean according to Student's t-test, follow a normal distribution according to the Shapiro-Wilk test, remain within the interval defined by the blue lines in the ACF plot for all lags, and present constant variance over time (homoscedasticity) according to the Breusch-Pagan test, we consider that the model residuals are white noise, which means that the model is unbiased [34][35][36][37][38].
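As an illustration of the ACF criterion, the sketch below computes sample autocorrelations and checks them against bounds of $\pm 1.96/\sqrt{n}$. It assumes that the blue lines are the usual approximate 95% limits drawn by standard ACF plots; the function names and the residual series are hypothetical, and the code is Python, not the authors' R implementation.

```python
import math


def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at a given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denominator = sum((r - mean) ** 2 for r in residuals)
    numerator = sum(
        (residuals[t] - mean) * (residuals[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


def acf_within_bounds(residuals, max_lag):
    """True if every lag from 1 to max_lag stays inside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(residuals))
    return all(
        abs(autocorrelation(residuals, lag)) <= bound
        for lag in range(1, max_lag + 1)
    )


# Hypothetical residuals that alternate in sign: strongly autocorrelated,
# so they should fail the check.
alternating = [1.0, -1.0] * 10
print(autocorrelation(alternating, 1))
print(acf_within_bounds(alternating, max_lag=5))
```

A model whose residuals fail this check at any lag still carries structure that it did not capture, which is why such a model is treated as potentially biased.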
The significance level adopted in this research is 0.05, which means that the residuals are considered white noise if the p-values obtained in each test are higher than 0.05 for each model. Thus, in this research we consider that the best model for each cancer type is the one whose residuals (1) fulfill all the requirements previously presented and that (2) obtains the lowest out-of-sample RMSE.
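The two-step rule above (exclude models whose residuals are not white noise, then pick the lowest out-of-sample RMSE) can be sketched as follows. The model names, p-values and RMSE figures are hypothetical and are not taken from Tables 6 to 9; the dictionary keys are illustrative names chosen for this Python sketch.

```python
ALPHA = 0.05  # significance level adopted in the text


def is_white_noise(diagnostics, alpha=ALPHA):
    """Residuals pass if all three p-values exceed alpha and the ACF check holds."""
    p_values = (
        diagnostics["student_p"],
        diagnostics["shapiro_p"],
        diagnostics["breusch_pagan_p"],
    )
    return diagnostics["acf_ok"] and all(p > alpha for p in p_values)


def select_best(models):
    """Return the unbiased model with the lowest out-of-sample RMSE, or None."""
    unbiased = {name: d for name, d in models.items() if is_white_noise(d)}
    if not unbiased:
        return None
    return min(unbiased, key=lambda name: unbiased[name]["rmse_out"])


# Hypothetical diagnostics for four candidate models of one cancer type.
models = {
    "model_a": {"student_p": 0.40, "shapiro_p": 0.30, "breusch_pagan_p": 0.20,
                "acf_ok": True, "rmse_out": 3.1},
    "model_b": {"student_p": 0.60, "shapiro_p": 0.50, "breusch_pagan_p": 0.45,
                "acf_ok": False, "rmse_out": 1.2},  # fails the ACF check
    "model_c": {"student_p": 0.70, "shapiro_p": 0.01, "breusch_pagan_p": 0.30,
                "acf_ok": True, "rmse_out": 2.0},   # fails normality
    "model_d": {"student_p": 0.55, "shapiro_p": 0.25, "breusch_pagan_p": 0.35,
                "acf_ok": True, "rmse_out": 2.7},
}

best = select_best(models)
print(best)
```

In this hypothetical case the model with the lowest RMSE overall (`model_b`) is discarded because it fails a noise test, so the rule can select a different model than an RMSE-only ranking would.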

Results
In this section we apply the methods presented in the columns of Table 5 to each type of cancer incidence presented in Fig. 3. In Table 6 we summarize the in-sample and out-of-sample RMSE results by model and type of cancer.
As mentioned in the Forecasting models evaluation section, to compare the model errors summarized in Table 6 we select the out-of-sample RMSE criterion. Then, to ensure that the model residuals are white noise in the training data, we apply the Student test (Table 7), the ACF plot, the Shapiro-Wilk normality test (Table 8) and the Breusch-Pagan test (Table 9). Thus, the best models for each cancer type are: NNETAR for breast, KF for colorectal, ARIMA for prostate, TBATS for lung, KF for cervical, the current method for head and neck, and KF for childhood.
Their prediction plots can be seen respectively in Figs. 12, 13, 14, 15, 16, 17 and 18. The 3-year-ahead prediction values are summarized in Table 11.

Discussion
A limitation of this research lies in the method used to obtain the incidence of cancer in Brazil. This occurs because, in practice, the incidence is not measured. Thus, we used the cancer incidence estimation methodologies proposed in Black et al. 23, Ferlay et al. 24 and Ferlay et al. 25, which are based on the mortality rate discussed in Section Data collection. Considering that the presented methodologies can give us the best cancer incidence estimation evaluating only univariate time-series models, our findings in Table 6 seem to indicate that the current model applied by INCA in Brazil to forecast cancer incidence underperforms in 6 of the 7 types of cancer considered in this research. So, the presented methodologies seem to behave more adequately than Brazil's current methodology.
It is important to note that we are working with the same type and amount of data that is used today, meaning that it would not be necessary to collect new variables in order to increase the accuracy of the forecast.
In addition, we did not see the CSM models outperform the others in any type of cancer, although ARIMA models (CSM) are the most widely used models in the current literature, as presented in Table 1.
These facts imply that, while there are no broad and reliable Population-Based Cancer Registries in the country, all research that uses these data as a primary source will be limited, including this one. However, it is necessary to consider that Brazil has continental dimensions and a technological lag that do not facilitate the implementation of this type of record. Although restrictive, this fact has not prevented research and public policies aimed at cancer prevention and control in the country, although these could surely be more effective.
In this sense, we reinforce that the aim is not to invalidate what has been done in the country, but to plead for the opening of space so that new, more accurate forecast models can be adopted to support strategic decisions to face cancer in the country. This is all the more relevant because the current literature (Table 1) has used models that go in the opposite direction of the results presented by this research.
For instance, MLM models were only used in the works of Soltani et al. 14 and Alrobai and Jilani 15, and only LSTM was evaluated. Considering SSM, the current literature presents only the research of Lee et al. 10, in which only the KF approach is proposed. In Table 11, we see that SSM (KF and TBATS) was selected in four of the seven types of cancer evaluated, while MLM (NNETAR), CSM (ARIMA) and the current method were each selected for one type of cancer.
The evaluation process adopted in this research and presented in Section Forecasting models evaluation was crucial to identify and discard biased models for each type of cancer. If we had only considered the in-sample RMSE criterion (measuring the best fitted model, on average) to select the models for each type of cancer, MLP would have been selected in all time series evaluated.
On the other hand, if we had considered only the out-of-sample RMSE criterion (measuring the best predicted values, on average), ARIMA and MLP would each have been selected in two types of cancer, while ETS, TBATS and KF would each have been selected in only one type of cancer time series (NNETAR and the current method would not have been selected). The noise evaluation process adopted also allowed us to state that the current model can potentially provide a biased prediction because it failed the ACF plot test for breast and cervical cancer, as we can see in Fig. 4. Therefore, we cannot classify it as statistically valid for making predictions.
It is important to note that both cancers affect the female population, and continuing to use the current method could jeopardize the efficient planning of resources for their diagnosis and treatment.
Considering that, in Brazil, government policies and programs are mostly focused on these types of cancer, the situation may pose an important challenge to be overcome. Finally, by evaluating Brazil's current approach, CSM, SSM and MLM using four exclusion criteria (zero mean, normality, ACF and homoscedasticity tests) and one decision criterion (lowest out-of-sample RMSE), we were able to establish the best unbiased model for each type of cancer, as we wanted to illustrate. We also emphasize that by comparing different methods we can potentially improve on the main issue addressed in this research: how to provide unbiased and reliable cancer forecasting.
Although it is not the focus of this research, causal and multivariate time-series models associated with other control variables, such as cigarette smoking as a predictor of lung cancer and HPV vaccination coverage for cervical cancer, should be investigated. Another promising direction is to investigate age-period-cohort (APC) models and combine them with the time-series models proposed in this research.

Conclusions
This research aimed to present and apply the main time-series-based models available in the forecasting literature to the seven most prevalent types of cancer in Brazil. These models fall into three classes: classical statistical models, State-Space models, and machine learning models.
As mentioned in the Theoretical Background section, it is the first attempt to apply unseen methods (TBATS, NNETAR and MLP) and the three classes of models to cancer prediction.
In Brazil, the incidence of cancer is not directly measured and must be estimated based on the mortality rate. Despite the challenge of not directly measuring cancer incidence, it is crucial for public health systems to estimate the incidence of a disease that ranks second in terms of mortality rate per 100,000 inhabitants.
While acknowledging the issue of not directly measuring incidence, our research mitigates this concern by utilizing the same data and employing the same cancer incidence estimation methods. This consistency ensures that our comparison between Brazil's current prediction method and our proposed methods remains valid.
We also helped to fill a literature gap identified in Table 1 by applying the TBATS, MLP and NNETAR forecasting techniques to predict seven cancer types in a Brazilian district.
Furthermore, we did not find any similar studies that compared the results of three classes of univariate time-series forecasting models or addressed more than one type of cancer.
When comparing only the error results (RMSE in sample and out of sample) between the approaches mentioned above and the current technique, we demonstrated that the current method underperforms for all types of cancer tested.
Moreover, in the Discussion section, we illustrated that, for breast and cervical cancers, the current approach applied in Brazil produced biased residuals, potentially affecting the quality and reliability of cancer incidence predictions in this country. Consequently, it may provide inaccurate information to healthcare decision-makers.
Therefore, we suggest that the methods evaluated in this study should be integrated into Brazil's cancer forecast methodology to provide a reliable prediction for healthcare decision-makers.
For further research, we also suggest a comparison between MLM time-series approaches: NNETAR and MLP (covered in this research) with LSTM, which has also been used in recent previous works such as Soltani et al. 14 and Alrobai and Jilani 15, presented in Table 1.
Although it was not the focus of this research, it should be noted that age-period-cohort (APC) analysis, previously mentioned in Section Theoretical Background, and Ensemble APC analysis, as well as the consideration of birth-cohort effects 39,40, have the potential to provide more accurate forecasts compared to traditional time-series methods that only consider period components.
Finally, by contributing with a proposal for the application of a set of tested forecasting methods to estimate the incidence of cancer in Brazil, it is intended that the results encourage a discussion on the adoption of anticipatory actions, aimed at prevention and the provision of means and resources for the early detection of the most prevalent types of cancer.
In this sense, to provide more robust predictions, causal models could also be taken into account, as in [41][42][43][44][45][46][47], where they are applied to other diseases. Using them, it is possible to evaluate, for instance, the impact of smoking reduction or HPV vaccination strategies on lung and cervical cancer, respectively.

Figure 1. ICD-10 Mortality rate by 100,000 inhabitants considering world population-adjusted by cancer type.
• The noise evaluation over the training (in sample) data according to the following tests: Student (ST), normality (NT), auto-correlation function (ACF) plot and Breusch-Pagan (BPT);
• The error evaluation according to the test (out of sample) Root Mean Square Error (RMSE).

Figure 5. Breast cancer noise evaluation by model.

Figure 6. Colorectal cancer noise evaluation by model.

Figure 7. Prostate cancer noise evaluation by model.

Figure 8. Lung cancer noise evaluation by model.

Figure 9. Cervical cancer noise evaluation by model.

Figure 10. Head and Neck cancer noise evaluation by model.

Figure 11. Childhood cancer noise evaluation by model.

Figure 13. KF colorectal cancer IRa fitted and prediction values.

Figure 14. ARIMA prostate cancer IRa fitted and prediction values.

Figure 16. KF cervical cancer IRa fitted and prediction.

Figure 17. Current method head and neck cancer IRa fitted and prediction values.

Figure 18. KF Childhood cancer IRa fitted and prediction values.

Table 1. Forecasting model applied by cancer type. Source: The authors.

Table 2. ICD-10 Mortality rate by 100,000 inhabitants considering world population-adjusted by cancer type.

Table 3. Filters and criteria used to retrieve cancer data by cancer type.

Table 4. I/M ratio by cancer type.

Table 5. Forecasting models applied in this research.

Table 6. RMSE per type of cancer per model.

Table 7. Student test p value per type of cancer per model.

Table 8. Normality test p value per type of cancer per model.

Table 10. White noise failure evaluation summary per type of cancer per model.

Table 11. Three years IRa prediction using the best model to each cancer type.